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ABSTRACT 

This paper presents a new linear hyperspectral unmixing methiod 
of the minimum volume class, termed simplex identification via split 
augmented Lagrangian (SISAL). Following Craig's seminal ideas, 
hyperspectral linear unmixing amounts to finding the minimum vol- 
ume simplex containing the hyperspectral vectors. This is a noncon- 
vex optimization problem with convex constraints. In the proposed 
approach, the positivity constraints, forcing the spectral vectors to 
belong to the convex hull of the endmember signatures, are replaced 
by soft constraints. The obtained problem is solved by a sequence of 
augmented Lagrangian optimizations. The resulting algorithm is very 
fast and able so solve problems far beyond the reach of the current 
state-of-the art algorithms. The effectiveness of SISAL is illustrated 
with simulated data. 

Index Terms — Hyperspectral unmixing, Minimum volume sim- 
plex. Variable Splitting augmented Lagrangian, nonsmooth optimiza- 
tion. 



1. INTRODUCTION 

Hyperspectral unmixing is a source separation problem |T|. Com- 
pared with the canonical source separation scenario, the sources in 
hyperspectral unmixing (i.e., the materials present in the scene) are 
statistically dependent and combine in a linear or nonlinear fashion. 
These characteristics, together with the high dimensionality of hy- 
perspectral vectors, place the unmixing of hyperspectral mixtures be- 
yond the reach of most source separation algorithms, thus fostering 
active research in the field |2|. 

Given a set of mixed hyperspectral vectors, linear mixture anal- 
ysis, or linear unmixing, aims at estimating the number of reference 
materials, also called endmembers, their spectral signatures, and their 
abundance fractions (HIHSEIEIISI. The approaches to hyperspectral 
linear unmixing can be classified as either statistical or geometrical. 
The former address spectral unmixing as an inference problem, often 
formulated under the Bayesian framework, whereas the latter exploit 
the fact that the spectral vectors, under the linear mixing model, are 
in a simplex whose vertices represent the sought endmembers. 

1.1. Statistical approach to spectral unmixing 

Modeling the abundance fractions (sources) statistical dependence in 
hyperspectral unmixing is a central issue in the statistical framework. 
In (7 1, the abundance fractions are modeled as mixtures of Dirichlet 
densities. The resulting algorithm, termed DECA (dependent com- 
ponent analysis), implements an expectation maximization iterative 
scheme for the inference of the endmember signatures (mixing ma- 
trix) and the density parameters of the abundance fractions. 

The inference engine in the Bayesian framework is the poste- 
rior density of the entities to be estimated, given the observations. 
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According to Bayes' law, the posterior includes two factors: the ob- 
servation density, which may account for additive noise, and a prior, 
which may impose constraints on the endmember matrix {e.g., non- 
negativity of its elements) and on the abundance fractions (e.g., to 
be in the probability simplex) and model spectral variability. Works 
jSllgl are representative of this line of attack. 

1.2. Geometrical approach to spectral unmixing 

The geometrical approach exploits the fact that, under the linear mix- 
ing model, hyperspectral vectors belong to a simplex set whose ver- 
tices correspond to the endmembers. Therefore, finding the endmem- 
bers is equivalent to identifying the vertices of the referred to simplex. 

If there exists at least one pure pixel per endmember (i.e., con- 
taining just one material), then unmixing amounts to finding the spec- 
tral vectors in the data set corresponding to the vertices of the data 
simplex. Some popular algorithms taking this assumption are the 
the N-FINDR |10|, the the pixel purity index (PPI) |11|, the auto- 
mated morphological endmember extraction ( AMEE) 1 1 2 1 , the vertex 
component analysis (VCA) |4|, and the simplex growing algorithm 
(SGA) fT3). 

If the pure pixel assumption is not fulfilled, what is a more re- 
alistic scenario, the unmixing process is a rather challenging task, 
since the endmembers, or at least some of them, are not in the data 
set. A possible line of attack, in the vein of the seminal ideas intro- 
duced in 1 6 1, is to fit a simplex of minimum volume to the data set. 
Relevant works exploiting this direction are the minimum volume en- 
closing simplex (MVES) 1 1^, the minimum volume simplex analysys 
(MVS A) 1 15], and the nonnegative matrix factorization minimum 
volume transfi)rm (NMF-MVT) |16|. MVES and MVS A, although 
implemented in rather different ways, yield state-of-the-art results. 
Their major shortcoming is the time they take for more than, say, 10 
endmembers and more than 5000 spectral vectors. 

1.3. Proposed approach 

We introduce the simplex identification via split augmented La- 
grangian (SISAL) algorithm for unsupervised hyperspectral linear 
unmixing. SISAL belongs to the minimum volume class, and thus 
is able to unmix hyperspectral data sets in which the pure pixel 
assumption is violated. 

In SISAL, the positivity hard constraints are replaced by hinge 
type soft constraints, whose strength is controlled by a regularization 
parameter. This replacement has three advantages: I) robustness to 
outliers and noise; 2) robustness to poor initialization; 3) opens the 
door to dealing with large problems. Furthermore, for large values 
of the regularization parameter, the hard constraint formulation is re- 
covered. 

To tackle the hard nonconvex optimization problem we have in 
hands, we solve a sequence of nonsmooth convex subproblems, using 
variable splitting to obtain a constraint formulation, and then apply- 
ing an augmented Lagrangian technique 1 17, 18 1. This sub-problems 
implement an alternate minimization scheme, with very simple and 
fast steps. 



The paper is organized as follows. Section 2 formulates the prob- 
lem. Section 3 proposes a sequence of nonsmooth subproblems, Sec- 
tion 4 solves the nonsmooth problems via a variable splitting aug- 
mented Lagrangian scheme, Section 5 presents results, and Section 6 
ends the paper by presenting a few concluding remarks. 

2. PROBLEM FORMULATION 

Let us assume that in a given scene there are p materials, termed end- 
members, with spectral signatures mi G R', for i — 1 . . . ,p, where 
I >p denotes the number of spectral bands. Under the linear mixing 
model, a given hyperspectral observed vector is a linear combination 
of the endmember spectral signatures, where the weights represent 
the fractions that each material occupies in the pixel. Therefore, the 
observed spectral vectors are in the convex hull of endmember spec- 
tral signatures. 

To fix notation, let Y = [yi,...,y„] G K'^" denote a 
matrix holding the observed spectral vectors G R' and S = 
[si, . . . , s„] G R*"^" a matrix holding the respective fractions; i.e., 
yi = Msi, for i = 1, . . . , n, where M = [mi, . . . , nip] G R'"''' 
is the mixing matrix containing the endmembers, and Si is a vector 
denoting the fractions, often termed fractional abundances. Since 
the components of Si are nonnegative and sum one (they are frac- 
tions), then the fractional abundance vectors belong to the standard 
p-simplex set iSp = {s G R'' : s >^ 0, s — 1}Q Therefore, 



MS, 



S G S^. 



(1) 



Assuming that the endmember spectral signatures m^, for i 



,p, are linearly independent, then the set {y G 



y = 



Ms, s G Sp} is a (p — 1) -dimensional simplex, and estimating M 
amounts to infer its vertices. Figure[T]illustrates this perspective. 




C = conv{M) 
2-simplex 

Fig. 1. Illustration of the 2-simplex set generated the columns of M. 

In this work, we assume that the number of endmembers and 
signal subspace is known before hand (see, e.g., 1191 ) and that the 
observed vectors yi,i = 1, . . . ,n, represent coordinates with respect 
to a p-dimensional basis of the signal subspace, i.e., I — p. 

Given Y, and inspired by the seminal work |6 1, we infer matrices 
M and S by fitting a minimum volume simplex to the data subject to 
the constraints S ^ and S — 1„. Since the volume defined 
by the columns of M is proportional to |det (M)| (note that we are 
assuming that M is square), then 



M* = argmin I det(M)| 

s.t. : QY>rO, iJ'QY^l^, 



(2) 



where Q = M ^ . Since det(Q) — 1/ det(M), we can replace the 
problem ^ with 



Q* — arg nnn — log | det(Q) [ 
s.t.: QY y 0, ijQY = 1^. 



(3) 



The constraints in l|3j define a convex set. If matrix Q is sym- 
metric and positive-definite , the problem ^ is convex. However, in 



most cases of practical interest Q is neither symmetric nor positive- 
definite and, thus, the ^ is nonconvex. Therefore, there is no hope 
in finding systematically the global optima of The SISAL algo- 
rithm, we introduce below, aims at "good" sub-optimal solutions. 

Our first step is to simplify the set of constraint QY = 1^. 
We note that the vector 1^ does not belong to the null space of Y. 
Otherwise, the null vector would belong to the the affine hull of M, 
what would imply that the columns of M would not be independent. 
Therefore, by multiplying the equality constraint on the right hand 
side by Y^(YY^)-\ we get (ijQY = 1^) ^ (ijQ = a^), 
where = 1„Y^(YY^)~^. The problem l|3j simplifies, then, to 



Q* — arg min — log | det(Q) 
Q 

St. : QY >z 0, IpQ = a^. 
Instead of solving Q, we solve the following modified version: 



Q* 



s.t. 



arg min 
ljQ = a 



-log|det(Q)|+A||QY|U 



(4) 



(5) 



where ||X||h = "^^ij h{['K]ij) and h{x) = max{— a:,0} is the so- 
called hinge function. Notice that ||QY||h penalizes the negative 
components of QY proportionally to their magnitude, thus playing 
the rule of a soft constraint or a regularizer. The amount of regu- 
larization is controlled by the regularization parameter A > 0. As 
already referred to, the soft constrained formulation yields solutions 
that are robust to outliers, noise, and poor initialization. Furthermore, 
by replacing n x p equality constraints by a regularizer, it opens the 
door to deal with large scale problems. 

3. SEQUENCE OF CONVEX SUBPROBLEMS 

Let q = vec(Q) denote the operator that stacks the columns of Q in 
the column vector q. Given that vec (AB) = (B^ (g) I) vec (A) = 
(I eg) A) vec(B), where (g) denotes the kronecker operator, and defin- 
ing /(q) = — log I det(Q)|, then l|5) may be written as 



s.t. : 



argmin /(q) + A||Aq||h 
q 

Bq = a, 



(6) 



'x >; y means Xi > yi for i = 1, . . . , p; 1 J = (1, . . . , p). 



where A = (Y^ g) I) and B = (I g) ij). The Hessian of / is 
H — K„[Q~^ g) Q~^], where K„ id the comutation matrix (i.e., 
K„vec(A) — vec(A"^)). Since H has positive and negative eigen- 
values, the above problem in nonconvex and thus hard. 

Using a quadratic approximation for /(q), we approximate l|6j 
by computing a descent sequence q^, fc = 0, 1, . . . , with the follow- 
ing algorithm: 

Algorithm 1 Sequence of strictly convex subproblems 

1. Set fc = 0, choose > and qo = VCA(Y). 

2. repeat 

3. /fc = /(qO+_A|lAq,|U 

4. g = -vec (Q ^) 

5. qfc+i G argminqg^q + ^||q - qfelp + A||Aq||h 

6. s.t.: Bq = a 

7. if/(qfc+i)+A||Aq,+i|U >Zfe 

8. find q G {aqfc+i + (1 - Q)qfc : < a < 1} 

9. such that /(q) + A||Aq||h < h 

10. qfe+i = q 

11. + 1 

12. until a stopping criterion is satisfied. 

Algorithm 1 is initialized with the VGA estimate. Line 4 
computes the gradient of /(q). Line 5 minimizes a strictly convex 
approximation to the initial objective function, where the term /(q) 
was replaced by a quadratic approximation. The term ||q — q^lP 
ensures that ||qfc+i — qfc|[^ does not grow unbounded. Lines 7 to 
10 ensures that the objective function does not increase. To solve 
the minimization 5-6, we introduce in the next subsection a variable 
splitting augmented Lagrangian algorithm. 



4. VARIABLE SPLITTING AND AUGMENTED 
LAGRANGIAN 

The optimization problem 5-6 of Algorithm 1 is equivalent to 

min E{q,z) (7) 

q.z 

s.t. : Bq — a, Aq = z, 

where 

In Q, the variable q was split into the pair (q, z) and linked through 
the constraint Aq — z. The so-called augmented Lagrangian (AL) 
for this problem, with respect to the constraint Aq = z is given by 

/:(q,z,d,r) = £(q,z) + a^(Aq-z)-f r||Aq-zf (8) 
= £(q,z)+r||Aq-z-df -he, (9) 

where a is holds the Lagrange multipliers, d = — q;/(2t), and c 
is an irrelevant constant. The AL algorithm consists in minimizing 
£ with respect to (q, z) and then updating ct, or, equivalently d, as 
follows 

Algorithm 2 Augmented Lagrangian Algorithm 

1. Set t — 0, choose (qo, zo), ao, and r > 0, 

2. repeat 

3. (qt+i,zt+i) e argminz-C(q, z,dt,r) 

4. S t.: Bq = a 

5. dt+i ^ dt - (Aq^^i - zt+i) 

6. + 1 

7. until a stopping criterion is satisfied. 



where soft- (a;, /3) = (max{|a;-|-/3/2| -/3/2, 0})(x/!a;|) is athresh- 
olding/shrinkage function and, for a matrix X, soft_(X, /3) is the 
componentwise application of soft_(-,/3). We note that computa- 
tions l IlQt and l ll2t are very light: In the first case, all matrices in- 
volved are x and are not iteration dependent. In the second 
case, the function soft- (•) takes negligible time. 

The iterations of Algorithm 3 are much faster than that of Algo- 
rithm 2. There is, however, the question of convergence. The answer 
to this question turns out to be positive, a result that can be proved 
via the equivalence between the alternating split AL algorithm just 
described and the so-called Douglas-Rachford splitting method, ap- 
plied to the dual of problem l[7]l; see 1 23 , Theorem 8], [,2U . 

In conclusion, the pseudo-code for SISAL is given by Algorithm 
1 with the step 5 replaced by Algorithm 3. 

5. EXPERIMENTAL RESULTS 

This section presents results obtained with SISAL, MVSA (hard ver- 
sion) 1 15 1, MVES 1 14 1, and VCA |4| apphed to simulated data sets. 
The SISAL regularization parameters was set to A = 10, The re- 
maining parameters were set to r = 1 and pi — 10^*. Although 
these values may be far from optimal, they led to excellent results. 
The data was generated according to the linear observation model 
([T). The abundance fractions are Dirichlet distributed with parame- 
ter = 1, for i = 1, . . . ,p. The mixing matrix M is randomly 
generated with i.i.d. uniformly distributed elements. To ensure that 
no pure pixel is present, we discarded all pixels with any abundance 
fractions larger than 0.8. The signal-to-noise ratio (SNR) defined as 
||Y|||./||N|||., where |[ • |[|- denotes the Frobenius norm and N is 
zero-mean Gaussian additive noise, was set to SNR=40dB. 



It has been shown that, with adequate initializations, the AL al- 
gorithm generates the same sequence as a proximal point algorithm 
(PPA) applied to the Lagrange dual of problem Q; for further details, 
see I20II21I and references therein. Moreover, the sequence d^, for 
k = 0, 1, ... , converges to a solution of this dual problem and all 
cluster points of the sequence Zfe, for fc = 0, 1, . . . , are solutions of 
the (primal) problem Q |20|. 

The exact solution of the optimization with respect to (q, z) in 
the line 3 of the Algorithm 2 is stil a complex task. However, the 
block minimizations with respect to q and with respect to z are very 
light to compute. Based on this, we propose the following modifica- 
tion of Algorithm 2: 

Algorithm 3 Alternating Split AL 
1. Set t — 0, choose (qo, zo), ao, and r > 0, 
repeat 

T„ , Ml, 1,2 T 



qt+i€argming q+-|lq-qfci| +-||Aq-Zt-dt 
q 2 2 

s.t.: Bq = a 

1 2 

Zt+1 G argmin -||Aq^ , - z - dt|| + -||z||h 

z Z T 

dt+1 ^ dt - (Aq^^;^ - zt+i) 
until stopping criterion is satisfied. 



The solution of the quadratic problem with linear constraints 3-4 is 

qt+i = F^^b - F"^B^(BF"^B^)"^(BF"^b - a), (10) 
where 

(11) 



F 

b 



(pi + rA^A) 
Mqt - g + TA^(zt + dt). 

The minimization with respect to z, in line 5, is, by definition 
the proximity operator of of the convex function ||z||;i |22|, which is 
similar to the soft threshold function but applied just to the negative 
part of its argument: 




V(1,:) 

(a) 




zt+i =soft-(Aqt+i -dt,/i/T) 



(12) 



(b) 



Fig. 2. Unmixing results for (a) p = 3 and (b) p = 20 number of end- 
members for SISAL, MVSA, and MVES algorithms. Dots represent 
spectral vectors; all other symbols represent inferred endmembers by 
the unmixing algorithms. The unmixing problem with n = 10000 
spectral vectors and p = 20 endmembers is far beyond the reach of 
MVSA, and MVES. 



Table 1. Comparison of SISAL, MVSA, and MVES algorithms for 
different number of endmembers and sample size n = 10000. The 
time is in seconds, the symbol "*" means the algorithm ran out of 
memory, while f indicates that the algorithm was aborted before con- 
verging. Note the 0{np) time complexity of SISAL. 
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Fig. |2]shows a projection on a subspace of the true endmembers, 
of the SISAL, MVSA, and MVES estimates and the spectral vectors. 
The data set has size n — 10000 and a number of endmembers p = 3, 
(part a), and p = 20, (part b). In part b, we just plot SISAL results 
because MVES takes hours for p > 10, and MVSA exhausts the 
memory forp > 12. 

Notice the high quality of SISAL, MVSA, and MVES estimates 
in both scenarios. This is not the case with VCA, as it was not con- 
ceived for non-pure pixel scenarios. VCA plays, however, a valuable 
rule in the SISAL, MVSA, and MVES initializations. 

Table[T]shows the times in seconds and the Frobenius norm ||e||F 
of the endmember error matrices M — M. The experiments were 
performed on an PC equipped with a Intel Core Duo 3GHz CPU and 
4 GB of RAM. The errors are comparable. However, MVES takes 
much longer and we could not run it for more than p = 10. MVSA 
ran out or memory for p > 10. The time SISAL takes is well approx- 
imated by a 0{np) bound, what could be inferred from its structure. 

6. CONCLUSIONS 

SISAL, a new algorithm for hyperspectral unmixing method of min- 
imum volume class, was introduced. The unmixing is achieved by 
finding the minimum volume simplex containing the hyperspectral 
data. This optimization problem was solved by a sequence of vari- 
able splitting augmented Lagrangian optimizations. The algorithm 
complexity is 0{np), where n is the number of spectral vectors and 
p is the number of endmembers what is much faster than the previous 
state-of-the-art, allowing to solve problems far beyond the reach of 
SISAL's competitors. 
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